Multi-Factor Duplicate Question Detection in Stack Overflow

Similar resources

StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow

Stack Overflow (SO) has been a great source of natural language questions and their code solutions (i.e., question-code pairs), which are critical for many tasks including code retrieval and annotation. In most existing research, question-code pairs were collected heuristically and tend to have low quality. In this paper, we investigate a new problem of systematically mining question-code pairs...

Duplicate Question Pair Detection with Deep Learning

Determining whether two questions are asking the same thing can be challenging, as word choice and sentence structure can vary significantly. Traditional natural language processing techniques such as shingling have been found to have limited success in separating related questions from duplicate questions. Using a dataset of 400,000 labeled question pairs provided by question-and-answer forum Q...

Stack Overflow Query Outcome Prediction

Stack Overflow’s core mission is to create an online encyclopedia for all programming knowledge. In order to ensure quality content in the face of rapid growth, community moderators frequently close low quality questions, often asked by newcomers. In order to alleviate moderator burden and ease newcomers’ transition, we devise two classifiers to predict 1) whether a question will be closed and ...

Ways of Asking and Replying in Duplicate Question Detection

This paper presents the results of systematic experimentation on the impact in duplicate question detection of different types of questions across both a number of established approaches and a novel, superior one used to address this language processing task. This study yields novel insight into the different levels of robustness of the diverse detection methods with respect to differe...

CASE-QA: Context and Syntax embeddings for Question Answering On Stack Overflow

Question answering (QA) systems rely on both knowledge bases and unstructured text corpora. Domain-specific QA presents a unique challenge, since relevant knowledge bases are often lacking and unstructured text is difficult to query and parse. This project focuses on the QUASAR-S dataset (Dhingra et al., 2017) constructed from the community QA site Stack Overflow. QUASAR-S consists of Cloze-sty...

Journal

Journal title: Journal of Computer Science and Technology

Year: 2015

ISSN: 1000-9000,1860-4749

DOI: 10.1007/s11390-015-1576-4